Conversation

Contributor

@roomote roomote bot commented Sep 5, 2025

Summary

This PR addresses Issue #7702 by implementing configurable batch size limits for embedding models, specifically to support Aliyun Bailian's Qwen3-Embedding models, which have a maximum batch size of 10 items per request.

Problem

The Qwen3-Embedding and text-embedding-v4 models from Aliyun Bailian have a strict batch size limit of 10 items per request. When indexing codebases with more than 10 chunks, the embedding API returns an error:

HTTP 400 - Value error, batch size is invalid, it should not be larger than 10
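
For illustration, a minimal sketch of the kind of request that triggers this error, using the openai npm client pointed at Bailian's OpenAI-compatible endpoint (the base URL and chunk contents here are illustrative, not taken from the PR):

import OpenAI from "openai"

// Bailian's OpenAI-compatible endpoint; may differ per region or account.
const client = new OpenAI({
    baseURL: "https://dashscope.aliyuncs.com/compatible-mode/v1",
    apiKey: process.env.DASHSCOPE_API_KEY,
})

async function main() {
    // 25 inputs in one request exceeds the 10-item limit and fails with
    // HTTP 400 "batch size is invalid, it should not be larger than 10".
    const chunks = Array.from({ length: 25 }, (_, i) => `code chunk ${i}`)
    await client.embeddings.create({
        model: "text-embedding-v4",
        input: chunks,
    })
}

main().catch(console.error)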

Solution

  1. Added batch size configuration to embedding model profiles

    • Added maxBatchSize property to the EmbeddingModelProfile interface
    • Configured Qwen3-Embedding and text-embedding-v4 models with a 10-item batch limit
  2. Updated OpenAICompatibleEmbedder to respect batch limits

    • Modified the batching logic to check both token limits and item count limits (see the sketch after this list)
    • Ensures batches never exceed the model-specific maximum batch size
  3. Updated service factory to propagate batch limits

    • Passes model-specific batch size limits to embedders
    • Applies limits to scanner and file-watcher components
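
A minimal sketch of the batching logic described in item 2 above; the names (createBatches, estimateTokens, MAX_BATCH_TOKENS) are illustrative and not the PR's exact identifiers:

// Illustrative token budget per batch; the real limit is model-specific.
const MAX_BATCH_TOKENS = 100_000

// Rough token estimate (~4 characters per token), sufficient for this sketch.
function estimateTokens(text: string): number {
    return Math.ceil(text.length / 4)
}

// Split texts into batches that respect both the token budget and an
// optional model-specific item count limit (e.g., 10 for Qwen3-Embedding).
function createBatches(texts: string[], maxBatchSize?: number): string[][] {
    const batches: string[][] = []
    let current: string[] = []
    let currentTokens = 0

    for (const text of texts) {
        const tokens = estimateTokens(text)
        const overTokens = current.length > 0 && currentTokens + tokens > MAX_BATCH_TOKENS
        const overItems = maxBatchSize !== undefined && current.length >= maxBatchSize
        if (overTokens || overItems) {
            batches.push(current)
            current = []
            currentTokens = 0
        }
        current.push(text)
        currentTokens += tokens
    }
    if (current.length > 0) batches.push(current)
    return batches
}

With maxBatchSize set to 10, no batch ever contains more than 10 items, however small the individual texts are. (Per the test discussion below, the PR also skips single items that exceed the token limit with a warning; that path is omitted from this sketch.)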

Testing

  • Added comprehensive test suite for batch size limiting functionality
  • All existing tests pass without regression
  • New tests verify:
    • Model-specific batch size limits are respected
    • Batching works correctly with mixed text sizes
    • Aliyun Bailian models are properly limited to 10 items per batch

Impact

  • Backward compatible: Models without batch size limits continue to work as before
  • Configurable: Easy to add batch size limits for other models in the future
  • Tested: Comprehensive test coverage ensures reliability

Fixes #7702


Important

Adds batch size limits for embedding models to support Aliyun Bailian, updating embedders and service factory to respect these limits.

  • Behavior:
    • Adds maxBatchSize to EmbeddingModelProfile in embeddingModels.ts for model-specific batch size limits.
    • Updates OpenAICompatibleEmbedder in openai-compatible.ts to respect maxBatchSize.
    • Updates service-factory.ts to propagate batch size limits to embedders, scanners, and file-watchers.
  • Testing:
    • Adds openai-compatible-batch-limit.spec.ts to test batch size limiting functionality.
    • Tests ensure model-specific batch size limits are respected and batching works with mixed text sizes.
  • Misc:
    • Updates service-factory.spec.ts to mock getModelMaxBatchSize and test embedder creation with batch size limits.

This description was created by Ellipsis for b5b90b0.

- Add maxBatchSize property to EmbeddingModelProfile interface
- Add batch size limits for Aliyun Bailian models (qwen3-embedding, text-embedding-v4)
- Update OpenAICompatibleEmbedder to respect model-specific batch limits
- Update service factory to pass batch size limits to embedders and processors
- Add comprehensive tests for batch size limiting functionality

Fixes #7702
@roomote roomote bot requested review from cte, jr and mrubens as code owners September 5, 2025 12:05
@dosubot dosubot bot added the size:L (This PR changes 100-499 lines, ignoring generated files.) and bug (Something isn't working) labels Sep 5, 2025
Contributor Author

@roomote roomote bot left a comment

Reviewing my own code is like debugging in a mirror - everything looks backwards but the bugs are still mine.

if (modelMaxBatchSize && modelMaxBatchSize < batchSize) {
    batchSize = modelMaxBatchSize
}

Contributor Author

I notice there's duplicate logic here between createDirectoryScanner (lines 182-188) and createFileWatcher (lines 216-222) for checking model-specific batch size limits. Could we extract this into a helper method like getEffectiveBatchSize() to avoid the duplication?
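
One possible shape for that helper (hypothetical code, not taken from the PR):

// Hypothetical helper; clamps the configured batch size to the
// model-specific maximum, when one is defined.
function getEffectiveBatchSize(configuredBatchSize: number, modelMaxBatchSize?: number): number {
    if (modelMaxBatchSize !== undefined && modelMaxBatchSize < configuredBatchSize) {
        return modelMaxBatchSize
    }
    return configuredBatchSize
}

Both createDirectoryScanner and createFileWatcher could then delegate to this helper instead of repeating the comparison.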

// Aliyun Bailian models with batch size limits
"qwen3-embedding": { dimension: 1536, scoreThreshold: 0.4, maxBatchSize: 10 },
"text-embedding-v4": { dimension: 1536, scoreThreshold: 0.4, maxBatchSize: 10 },

Contributor Author

Are there other Aliyun Bailian models that might need similar batch size limits? Currently we only have qwen3-embedding and text-embedding-v4 configured. It might be worth checking their documentation for other models that could benefit from this.

dimension: number
scoreThreshold?: number // Model-specific minimum score threshold for semantic search
queryPrefix?: string // Optional prefix required by the model for queries
maxBatchSize?: number // Maximum number of items that can be sent in a single batch

Contributor Author

Consider adding more detailed documentation here to explain when and why this limit is needed. For example: 'Maximum number of items that can be sent in a single batch. Some providers (e.g., Aliyun Bailian) impose strict batch size limits on their embedding APIs.'
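
Applied to the interface, that suggestion might look like this (a sketch, following the comment style of the existing fields):

interface EmbeddingModelProfile {
    dimension: number
    // Model-specific minimum score threshold for semantic search
    scoreThreshold?: number
    // Optional prefix required by the model for queries
    queryPrefix?: string
    // Maximum number of items that can be sent in a single batch.
    // Some providers (e.g., Aliyun Bailian) impose strict batch size
    // limits on their embedding APIs.
    maxBatchSize?: number
}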

expect(mockEmbeddingsCreate.mock.calls[0][0].input).toHaveLength(10)
expect(result.embeddings).toHaveLength(10)
})
})

Contributor Author

Nice comprehensive test coverage! Consider adding one more edge case test: what happens when a single text item exceeds both the token limit AND we have a batch size limit? This would ensure the warning is still logged correctly and the item is skipped as expected.
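
A sketch of that edge case, assuming a vitest-style suite and that the embedder under test exposes a createEmbeddings method (both assumptions based on the spec files named above, not verified against the PR):

it("warns and skips a single item that exceeds the token limit when a batch size limit is set", async () => {
    // One oversized text plus a few normal ones; sizes are illustrative.
    const oversized = "x".repeat(1_000_000)
    const texts = [oversized, "chunk 0", "chunk 1", "chunk 2"]

    const warnSpy = vi.spyOn(console, "warn").mockImplementation(() => {})
    const result = await embedder.createEmbeddings(texts)

    // The oversized item is skipped with a warning rather than sent to the
    // API, and the remaining items still fit within the 10-item batch limit.
    expect(warnSpy).toHaveBeenCalled()
    expect(result.embeddings).toHaveLength(3)

    warnSpy.mockRestore()
})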

@hannesrudolph hannesrudolph added the Issue/PR - Triage (New issue. Needs quick review to confirm validity and assign labels.) label Sep 5, 2025
@daniel-lxs
Member

Closing, see #7702 (comment)

@daniel-lxs daniel-lxs closed this Sep 5, 2025
@github-project-automation github-project-automation bot moved this from New to Done in Roo Code Roadmap Sep 5, 2025
@github-project-automation github-project-automation bot moved this from Triage to Done in Roo Code Roadmap Sep 5, 2025
@daniel-lxs daniel-lxs deleted the fix/qwen-embedding-batch-size-limit branch September 5, 2025 23:15

Labels

  • bug (Something isn't working)
  • Issue/PR - Triage (New issue. Needs quick review to confirm validity and assign labels.)
  • size:L (This PR changes 100-499 lines, ignoring generated files.)

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

Code Index has no limit on batch size

4 participants